Do not delete triples from `rdf_unlinked_shared_data` #258
Merged
hannahbast merged 1 commit into main on Jan 31, 2026
So far, triples from `rdf_unlinked_shared_data` were deleted along with the triples from `rdf_deleted_data`. This is a mistake because the triples from `rdf_unlinked_shared_data` might still be relevant for other entities. We now simply keep them, even if that means that they may end up as orphaned triples at some point. The documentation of the Wikidata update stream explicitly allows (and even encourages) this behavior. Fixes ad-freiburg/qlever#2670
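For illustration, here is a minimal sketch of the new behavior. It is not the actual code from this PR. It assumes a simplified event in which each field holds a plain list of triple strings; the function name `split_update` and the fields `rdf_added_data` and `rdf_linked_shared_data` are assumptions made for the example.

```python
def split_update(event: dict) -> tuple[list[str], list[str]]:
    """Return (to_delete, to_insert) for one simplified update event.

    Assumption: every field is a list of triple strings. The real
    update stream may encode the data differently.
    """
    # Only triples from rdf_deleted_data are deleted.
    to_delete = list(event.get("rdf_deleted_data", []))

    # Triples from rdf_unlinked_shared_data are deliberately ignored:
    # other entities may still reference them, so they are kept even
    # if they end up orphaned.

    # Assumed field names for the triples to insert.
    to_insert = list(event.get("rdf_added_data", []))
    to_insert += list(event.get("rdf_linked_shared_data", []))
    return to_delete, to_insert


event = {
    "rdf_added_data": ["<s1> <p> <o1> ."],
    "rdf_deleted_data": ["<s2> <p> <o2> ."],
    "rdf_linked_shared_data": ["<s3> <p> <o3> ."],
    "rdf_unlinked_shared_data": ["<s4> <p> <o4> ."],
}
to_delete, to_insert = split_update(event)
```

In this sketch, the triple from `rdf_unlinked_shared_data` appears in neither list, so it stays in the index untouched.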
On the side, no longer insert `wikibase:Dump schema:dateModified DATE` triples. They were confusing because they had two different semantics. For the triples in the dump (produced by the Wikidata dump process), the minimum had to be taken to get the date until which all updates were considered. For the triples that used to be inserted during the live update (by the `qlever update-wikidata` command), the maximum had to be taken. Instead, there are now only the `wikibase:Dump wikibase:updatesCompleteUntil DATE` and `wikibase:Dump wikibase:updateStreamNextOffset OFFSET` triples, which have clear semantics (captured in the predicate names).
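To illustrate why the mixed semantics were confusing, here is a small sketch with made-up dates. The values and function names are hypothetical and serve only to show that the two rules disagree once dump and live-update triples sit side by side.

```python
from datetime import datetime

# Hypothetical dateModified values, for illustration only.
dump_dates = ["2026-01-20T00:00:00", "2026-01-21T00:00:00"]
live_dates = ["2026-01-30T12:00:00", "2026-01-31T08:00:00"]


def complete_until_dump_rule(dates: list[str]) -> datetime:
    # Rule for triples from the dump: take the minimum.
    return min(datetime.fromisoformat(d) for d in dates)


def complete_until_live_rule(dates: list[str]) -> datetime:
    # Rule for triples inserted by the live update: take the maximum.
    return max(datetime.fromisoformat(d) for d in dates)


# Once both kinds of triples share one predicate, a consumer cannot
# tell which rule applies, and the two rules give different answers.
mixed = dump_dates + live_dates
by_min = complete_until_dump_rule(mixed)
by_max = complete_until_live_rule(mixed)
```

A single `wikibase:updatesCompleteUntil` value avoids this ambiguity because no aggregation rule is needed to interpret it.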